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Vector auto-regressive (VAR) models typically form the basis for constructing directed 
graphical models for investigating connectivity in a brain network with brain regions of 
interest (ROIs) as nodes. There are limitations in the standard VAR models. The number 
of parameters in the VAR model increases quadratically with the number of ROIs and 
linearly with the order of the model and thus due to the large number of parameters, 
the model could pose serious estimation problems. Moreover, when applied to imaging 
data, the standard VAR model does not account for variability in the connectivity structure 
across all subjects. In this paper, we develop a novel generalization of the VAR model that 
overcomes these limitations. To deal with the high dimensionality of the parameter space, 
we propose a Bayesian hierarchical framework for the VAR model that will account for 
both temporal correlation within a subject and between subject variation. Our approach 
uses prior distributions that give rise to estimates that correspond to penalized least 
squares criterion with the elastic net penalty. We apply the proposed model to investigate 
differences in effective connectivity during a hand grasp experiment between healthy 
controls and patients with residual motor deficit following a stroke. 
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1. INTRODUCTION 

The analysis of brain networks has played an important role in 
characterizing and understanding brain processes and diseases 
(Bassett and Bullmore, 2009). Pollonini et al. (2010) showed 
that connectivity between different regions of the brain can dif- 
ferentiate between subjects with autism from healthy controls, 
and they suggested that connectivity patterns may provide an 
indicator for the early detection of autism. Wu et al. (2010) 
showed that the effective connectivity between the motor regions 
changes as movements become more automatic in patients with 
Parkinsons disease. Wang et al. (201 1) showed that HIV infection 
has an effect on resting-state connectivity. Benetti et al. (2009) 
and Skudlarski et al. (2010) showed the effects of schizophre- 
nia on brain connectivity. Because of the clinical implications 
of connectivity studies, it is imperative to advance the statisti- 
cal methodologies for connectivity analyses. In this paper, we 
develop the hierarchical vector- auto regressive (VAR) model to 
study alterations in brain effective connectivity in patients with 
chronic stroke. We shall demonstrate that the hierarchical VAR 
has a number of advantages: (i) it offers a flexible statistical frame- 
work for comparing connectivity across experimental conditions 
(e.g., active vs. rest) and subject groups (e.g., healthy vs. dis- 
ease); (ii) it quantifies the extent to which a covariate (such as age 
or genotype) can modify or moderate connectivity; and (hi) it 
correctly accounts for between-subject heterogeneity in the con- 
nectivity structure by including subject- specific parameters in the 
connectivity model. 



Brain networks are often characterized by two types of connec- 
tivity, namely, functional connectivity and effective connectivity, 
broadly defined in Friston (2004) as follows: functional connec- 
tivity is the temporal correlation between remote neurophysio- 
logical events while effective connectivity evaluates the influence 
that one neural system exerts over another. Functional connec- 
tivity is typically quantified by the cross-correlation between the 
time courses obtained from regions of interest (ROIs), but does 
not explore any lead-lag relationships between these ROIs. In this 
paper, we focus on effective connectivity, studying lead-lag rela- 
tionships where the directionality is determined by the temporal 
sequence in the model. 

The statistical approaches that provide information about 
the directionality of associations are structural equation mod- 
els (SEM) (Mcintosh and Gonzalez Lima, 1994), dynamic causal 
models (DCM) (Friston et al., 1993) and Granger causality mod- 
els (GC) (Granger, 1969; Goebel et al, 2003; Roebroeck et al, 
2005). DCM attempts to infer the temporal sequence of events, 
and possibly non-linear dependence, at the neuronal level. In 
DCM, one estimates the dependence and directionality of the 
neural source at the millisecond level using functional magnetic 
resonance imaging (fMRI) data which is observed every 1-2 s. To 
achieve inference on the temporal dynamics of a neural system, 
DCM must make assumptions that include (a) the specification 
of a generative model that maps the neuronal level activity to the 
observed fMRI signal and (b) an a priori model network struc- 
ture describing the anatomical inter- regional structure and the 
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effects of stimulus perturbations over the defined network. There 
are a number of problems with the DCM approach. It requires 
heavy computing time and thus the number of candidate mod- 
els that can be considered for comparison has to be constrained. 
Moreover, it is difficult to assess the validity of the physiological 
assumptions and goodness of fit of the estimated dynamics of the 
neuronal signal because the observed fMRI data is recorded at a 
time scale (usually 1-2 s) that is coarser than that of the inferred 
neuronal dynamics. The SEM approach aims to describe the 
covariance of the observed fMRI data given a predefined structure 
over a set of selected regions. The model coefficients give informa- 
tion of the expected change in hemodynamic activity of one ROI 
influenced by a unit change in another ROI. However, with SEM, 
it is difficult to estimate cyclical connections, i.e., those involv- 
ing feedback relationships. The main drawback of both SEM and 
DCM is that they rely on a priori specifications of one or sev- 
eral models. Therefore these methods are mainly used to confirm 
a priori hypotheses about a brain network rather than explore or 
identify network connections. In practice, it is not always possi- 
ble to have a priori information on the structures of the networks. 
This is especially difficult when the number of nodes (or ROIs) in 
the network is large. 

Under these scenarios, Granger causality via the VAR model 
is a viable option for exploring condition and covariate-specific 
effects on effective connectivity from the observable data. We 
point out, however, that "causality" may not necessarily be physi- 
ological. Rather, it is statistical in nature: if the hypothetical model 
A, which uses data in ROI 1 to predict activity at ROI 2, gives a 
more accurate and precise prediction compared to a hypothetical 
model B which does not include ROI 1, then we say that activity 
at ROI 1 "Granger-causes" activity at ROI 2 (Granger, 1969). The 
VAR model is the standard framework for investigating Granger 
causality. It has been widely applied to different brain signal 
modalities, including fMRI time courses (e.g., Goebel et al., 2003; 
Harrison et al., 2003; Roebroeck et al., 2005; Gorrostieta et al., 
2011) and EEG time series (e.g., Kaminski and Blinowska, 1991; 
Prado and Huerta, 2002; Fiecas and Ombao, 2011). However, 
there are also a number of controversial points related to GC 
within the VAR framework approach. When it is applied to fMRI 
signals, VAR models could identify spurious relationships because 
fMRI signals (observed at a coarse temporal resolution) are con- 
volved and delayed versions of the neuronal signals (unobserved 
processes that unfold at a millisecond scale). Thus, the VAR can- 
not identify lags or delays in the neuronal process that are smaller 
than the temporal resolution of the fMRI data. For example, a 
true delay of 200 and 500 ms at the neuronal level might not be 
distinguishable from fMRI data which were sampled every 1 s. 
In this paper, we emphasize that our proposed hierarchical VAR 
will be used to assess for Granger causality at the hemodynamic 
level rather than neuronal. Thus, when our analysis suggests that 
ROI 1 "Granger- causes" ROI 2, we conclude from VAR mod- 
eling that the hemodynamic activity at ROI 1 "Granger-causes" 
the hemodynamic activity at ROI 2. We do not make any con- 
clusions about the underlying neuronal dynamics. While brain 
scientists are often more interested to make inference on neu- 
ral activity, we point out that connectivity at the hemodynamic 
level can also yield interesting results. One more restriction of the 
VAR model is that it can only be applied to time series data with 



stationary periods, e.g., resting-state fMRI time courses or fMRI 
experiments with a block design. Thus, in this paper we center our 
proposed model in the context of two experimental conditions 
presented during blocks of time. Finally, although the notion of 
GC is not restricted to VAR models, it is often implemented under 
that context. Thus, directed links are currently restricted only to 
linear associations. We are now developing extensions of the VAR 
model to functional VAR model which has the potential to capture 
non-linear types of dependence. 

Between- subject variation in brain responses plays an impor- 
tant role in the analysis of brain networks and must be accounted 
for in the statistical model and inference because it can have 
a significant impact in the analysis. This, in fact, is one of the 
challenges in the Human Connectome Project (Van Essen and 
Ugurbil, 2012). Under DCMs, one can account for this hetero- 
geneity via a random effects analysis. For SEM and VAR models, 
there is no standard statistical approach for group -level analy- 
sis. For the latter, Deshpande et al. (2009) performed group-level 
inference by combining the p-values obtained from individual 
subjects using Fisher's method (Fisher, 1932) to obtain a sin- 
gle p-value. However, this approach does not report the extent 
of between-subject variability in effective connectivity. A nat- 
ural approach in multi- subject analysis is to proceed with the 
estimation of connectivity parameters in two stages: in the first 
stage subject- specific parameters are estimated and on the sec- 
ond stage between- subject variations in the estimates from the 
first stage are obtained. The two-stage approach is known in the 
statistical literature to be sub -optimal because information is lost 
when summarizing the original vector- valued time series for the 
each subject with their connectivity parameters. A reduction of 
information occurs, for example, by the omission of the subject- 
specific covariance matrix of parameters in the second stage, 
whose estimate is dependent on the length of the time series. In 
addition, when we have a large number of potential predictors 
in the model and the aim is to identify the important predictors, 
the implementation of the two-stage procedure is not immediate 
since this procedure does not take into account the large number 
of parameters. Moreover, the inclusion of a penalization crite- 
rion for parameter estimation in the two-stage approach is not 
direct, since in the first stage each subject would have his own set 
of selected predictors, potentially different across subjects, and in 
the second stage special attention should be considered in propos- 
ing a summary statistic that takes into account the sparsity in the 
parameters in the group of subjects. In this paper, we build on 
the approach in Gorrostieta et al. (2011) where they proposed 
the mixed- effects VAR model that allowed the effective connec- 
tivity structure to vary across subjects. Here, we adopt a Bayesian 
approach for statistical modeling and inference. 

In addition to the group -analysis inference, there are other 
challenges to modeling effective connectivity using the VAR 
model. For a brain network with .R ROIs, one would need R 2 con- 
nectivity parameters per time lag in the VAR model. Thus, one 
important problem with the VAR model is that the number of 
parameters grows quadratically with the number of ROIs con- 
sidered in the analysis and linearly with the order of the model. 
Due to the large number of parameters and the collinearity of 
the regressors of the VAR model, even in the case where there is 
pre-defined number of brain regions to be included in the model 
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estimation procedures via ordinary least squares could lead to 
numerical problems, unstable results, and lack of interpretabil- 
ity. Moreover, fMRI time courses for each subject are recorded for 
small periods of time for each stimulus in the experiment, lead- 
ing to relatively short time courses to be used for estimating an 
effectivity connectivity network per experimental condition. One 
of the most common approaches to manage the large number 
of parameters in the VAR model is penalized regression (Valdes- 
Sosa et al, 2005; Andrew Arnold, 2007; Martinez- Montes et al, 
2008; Davis et al, 2012). In the present work, we present a 
model with an estimation procedure that follows the framework 
of penalized regression while accounting for between subject vari- 
ability. Even for the classical linear model only a few methods 
exist that address both the high- dimensionality of the parame- 
ter space and the modeling of subject- variation (Bondell et al., 
2010; van de Geer, 2010; Fan and Li, 2012). However, these prob- 
lems have not been addressed and explored in a VAR model 
context. 

Despite the ability of the regularization methods to handle a 
relatively large number of parameters, we caution that it should 
not give the false sense of security to liberally choose any arbitrar- 
ily large model order for the VAR model with the aim of capturing 
the full temporal dynamics of fMRI time courses. To objectively 
select the best order, one could use some information -theoretic 
criterion. Another practical approach is to first fit a VAR(l); if the 
autocorrelation plots of the residuals look reasonably like white 
noise then stop at order 1; if not then continue by fitting a VAR(2) 
and so on. 

The purpose of this paper is to address the aforementioned 
problems in investigating effective connectivity. Specifically, we 
aim to extend the VAR model for estimating directed graphs that 
account for inter-subject variability as well as the high dimen- 
sionality of the parameter space. Currently, there is no standard 
way for this purpose. Here, we develop a novel methodology that 
puts the VAR model in a Bayesian hierarchical modeling frame- 
work that naturally permits modeling sources of variability within 
and between subjects. We appropriately specify the prior distri- 
butions over the parameters of the VAR model in order to achieve 
an equivalent elastic net penalization approach as developed by 
Zou and Hastie (2005), and controlling in this way for the large 
number of parameters and the collinearity in the VAR model 
regressors (Kyung et al., 2010; Li and Lin, 2010). The modeling 
strategy also provides a practical Gibbs sampling scheme that is 
relatively easy to implement. 

The remainder of the manuscript is organized as follows. We 
first formulate our model and develop the Bayesian modeling and 
inference procedure. This is followed by an application of our 
proposed method to an fMRI data set collected from a group 
of participants in a clinical study to determine alterations in 
effective connectivity due to stroke. We conclude with a discus- 
sion that highlights the strengths and advantages of the proposed 
hierarchical VAR model. 

2. THE HIERARCHICAL MODEL FOR EFFECTIVE 
CONNECTIVITY 

To investigate effective connectivity, we develop a hierarchi- 
cal VAR model formulated under the a Bayesian modeling 



framework. The model has parameters that characterize exper- 
imental condition, group and subject- specific cross-dependence 
between the £ ROIs. The key advantages of the proposed hierar- 
chical VAR model are the following: (i) it permits the use of many 
parameters, which is necessary for characterizing the dependence 
structure in data derived from complex processes such as fMRI 
time series, allowing efficient estimation of parameters even in the 
high dimensional setting and under high multi- collinearity in the 
regressors, (ii) it quantifies between-subject variations in connec- 
tivity, (iii) it identifies Granger causality both at the group and 
subject level, as well as characterize Granger causality by exper- 
imental condition, and (iv) it permits testing for differences in 
connectivity between patient groups and between experimental 
conditions. 

2.1. SINGLE SUBJECT MULTIPLE-STIMULI VAR MODEL NOTATION 

In the context of fMRI time series, it is common to register the 
data according to the timing of the presentation of the stimulus. 
We allow the VAR coefficients to vary according to the experi- 
mental conditions. Moreover, we will assume that following the 
presentation of condition A at time point t, the brain effective 
connectivity for condition A is activated and sustained until the 
future time when condition B is presented. Thus, our VAR model 
has coefficients that change over time (according to the timing of 
stimulus presentations), but are constant within a local interval 
until a different condition is presented. We formalize the above 
ideas by defining the parameters of the VAR model for each con- 
dition with the indicator functions W^\t) and W$\t). Suppose 
that condition A was presented at time t\ and condition B at time 
ti + M. Thus, from time t\ to t\ + M - 1, W^\t) takes on the 
value of 1 and (t) takes on the value of 0. At time t\ + M, 
we have (ti + M) = 0 and V\^ 5) (t\+M) = 1. Then the VAR 
model of order K for participant s with 2 conditions A and B is 
defined as 

Y®(f) = E (<X S) « + Y(s) (' - fc ) + e(s) « 
k=i 

(i) 

where Y (5) (0 = [Y[(t), . . . , Y s R (t)]', R is the number of ROIs, 
5=1, ...,S; and t=l,...,T s . The random vector e^ s \t) 
is white noise with Ee (5) (0 = 0 and Cove (s) (0 = £ T = 
diagft^ 1 , . . . , Xg 1 } does not change over time and it is assumed 
to be constant across all participants. The subject- specific connec- 
tivity parameters are defined by the components of the matrices 
<DaV &® k . Under condition A, the fMRI time series Y (5) (0 fol- 
lows a VAR model of order K and the dependence structures 
under this condition are summarized in the lag- specific con- 
nectivity matrices 4>^ for k = 1 , . . . , K. For condition B y these 
connectivity matrices are The number of mean parameters 
under the above model is q = R 2 x K x 2. 

In the classical linear model notation, we rewrite the VAR 
model for participant 5 as, 

y (5) = X (5) c|) (5) + e (5) , 5 = 1, . . . , S, e (5) - N(0, 2 T ® 1 T -k) 

(2) 
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where the matrix of dimension (T — K)R x q is obtained by 
stacking the £ x q time dependent matrices x^(f), defined for 
t = K+ 1, Tas 



.that 



Ir ® [w^(t)Y (s) (t -\Y ... W%\t)Y (s) (t - KY 
W^(t)Y (s) (t -\Y ... W^\t)Y (s) (t - , 



X® := 



~x^(X + 1) 

x (s) (T) J 



and y (5) := 



Y^(K+ 1)" 
Y (5) (T) 



2.2. HIERARCHICAL STRUCTURE 

To model inter- subject variability and derive inferences at the 
subject and group level, we describe the subject- specific vector of 
parameters as ty^ = (|) + where ty^ is a vector of dimension 
q = R 2 x K x 2, defined as, 



(s) 



vec 



|_ A,l 



A,i<C 



(3) 



The vector j;^ represents the subject- specific deviations such 
that ? (5) |D ~ Nq(0, D), and D = diag{d~\ d~\ d~ 1 }. The 
elements of D quantify the variability between subjects in the 
connectivity parameters. 

Because of the subject- specific parameters in the model, the 
co variance matrix also varies across subjects, and in fact, it is com- 
pletely determined by the subject- specific coefficients (j/^ and the 



noise co variance E T = diag{t 1 



, x R }. To show this, adopt- 



ing the matrix notation of connectivity parameters established in 
Equation 1, first we define M s := [^\w[ s \t) + ^k vv 2 
Then we rewrite the model as, 



Y (s) (0 =M s Y w (f- l) + e w (f) 



As) 



(*)< 



where, 



E(e (5) (0) 



: 0, Cov(e (5) (0, e /(5) (0) = £ TJ and 
Cov(e^(f)> e r ^ s \k)) = 0. Therefore, following the procedure 
detailed in Lutkepohl (2005), the subject-specific covariance 
matrix is given by 



Cov(Y (s) (0) = ^MjXtMf 



(4) 



where the matrix represent M to the i-th power and it is 
absolutely summable under the vector autoregressive framework. 
From Equation (4), now we can see clearly that the covari- 
ance matrix varies across subjects, which is a consequence of the 
subject-specific coefficients. 

The priors over the group connectivity parameters (|) are 
defined as suggested in Kyung et al. (2010) and Li and Lin (2010) 
to achieve the equivalent elastic net estimators. The proposed 
prior distribution is 



for V T := diagjjtj 1 1 2 rk • • • i r 1 12Rk]\ where 1 2 rk is a vector 
of ones with dimension 1 x 2RK and x r ~ Gam(r Xr , h Xr ); 
Vd> := diag{(ai + X 2 )~\ (a 2 + X 2 )~\ . . . , (c^ + X 2 ) -1 } and 
ay I X2 , y ~ (ay/ (ay + X 2 )) 1 / 2 InvGamma(l, y/2), where X 2 ~ 
h\) and y ~ Gam(r v , h Vr 

in the vector (|) linked 
x r L via the definitions of Vo and V x . 
N(0, x~ l (aj + X 2 ) -1 )} be the subset of 



Gam(rx, /i),) and y ~ Gam(r y , h y ). 

There is a subset of parameters 
to the variance ~ r ~ l 
Let (|) (r) := {<|>y : 



parameters associated with x r . For the corresponding 

normal distribution given in Equation (5) is equivalent to 
4>(r)|x r oc exp {- \ {2Jyx~r\ fy(r) I + ^2*A \<b(r) 1 1 2 ) }, and this prior 
distribution is equivalent to the elastic net approach com- 
monly used for variable selection (Li and Lin, 2010). Thus, 
for the vector (|), we have the prior distribution (|)|V X oc 

exp j-j (2^\Jv^$\ + X 2 ||V- 1 (|)|| 2 ^ J, which is equivalent 

to the elastic net. 

A graphical description of the hierarchical structure is pre- 
sented in (Figure 1). Note that even though we shrink the group 
connectivity parameters toward zero via their conditional pri- 
ors, we allow the subject specific connectivity parameters to be 
different from zero per the subject- specific deviations In 
this way we impose a sparse structure at the group level while 
simultaneously allowing for subject- specific deviations from this 
structure. 

Now we describe the full conditional distributions that we 
used in our Gibbs sampler. First, the structure of the posterior 
distribution of the model is 



fl ^y(X (5) (|> + Z (5) S (5) , E T <8> I T -k)N+(0, V x Vd>) 



5=1 

q 

N % (0, D)Y\G^m(r dp h dj ) 

R q 

Y\ Gam(r Xw , h Xn ) [](«//(«/ + X 2)) 1/2 

n=l j=l 

InvGam a; (l, y/2)Gam(rx, h\) Gam(r Y , h y ). 



(6) 




4>|V T , V* ~ N(0, V T V<j>), 



(5) 



<£|V T oc exp{-± (2/y| yfv?<t>\ + A 2 ||V-VH 2 )} 



FIGURE 1 | Graphical model of proposed hierarchical structure, arrows 
represent parameter dependence, blue fill indicates equivalent 
penalization parameters in the elastic net setting, green fill indicates 
matrices with parameters of variance. 
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To describe the conditional distributions for the Gibbs sampling, 
we will use the following matrix notation obtained by stacking the 
vectors and matrices stated in Equation (2): 



/y(D\ 

1,(2) 



\y (S V 

| (2) 



X = 


X (2) 


, z = 


0 


0 . 

z< 2 > . 


.. 0 \ 
.. 0 




/d 0 


... o\ 


1 0 


0 . 


. . Z< S >y 



D* = 



0 D ... 0 



0 0 ... D 



I x = E T ®ls(T-K)- 

From Equation (6), we obtain next the full conditional 
distributions: 

(i) group effective connectivity (|) ~ N([L^, V), where 

H = (x / i7 1 x + v t - 1 v- 1 )- 1 x / i7 1 (y-^), and 

V^XVX + V^V" 1 )- 1 ; 

(ii) subject- specific deviations % ~ N([i^, V^), where 



= (Z'l- l Z + D*- l y l Z'l- l (y-X$), and 
= (Z^Z + D* -1 ) -1 ; 
(iii) the variance components of 




f(T-K)S + 2RK + 2r Xr 

x r ~ Gamma , 

V 2 



where 



W( r) := {Wj € w : l x (i, i) = x r '}, 
w := if - X<|> - Zi;')V - X* - Z?') 



with 



? (r) := {?; e ? : V T (,; j) = x" 1 } < := f V" 1 *, 
X 2 ~ Gamma I r x + |, ]- ^ x/,^ 2 + /ix. J , and 



/ 1 9 \ 

y ~ Gamma I r y + q, - ^ a. 1 + h y J 

V 2 ;=i / 



(iv) f/ze variance components of 

dj ~ Gamma (s(r d - 1/2) + 1, i ^>f ) 2 + Sfc^ , 

this for the case of having set the prior distribution of dj as 
dj ~ Gamma(rj, /z^). 

The parameters in all Gamma distributions correspond to the 
parameters of shape and rate respectively. 

2.3. EFFECTIVE CONNECTIVITY NETWORKS 

2.3. 1. Group-specific effective connectivity networks 

We infer a group -specific Granger causality network by select- 
ing the parameters whose estimates survive some significance 
threshold, in our case we use the 95% credible region crite- 
ria. To construct the group -specific Granger causality network 
for experimental condition A, we draw a directed edge from 
region r to a region r' to representing Granger- causality if the 
95% credible region of the lag-joint parameter distribution of 
[^\(r\ r), . . . , $A K (r', r)] does not include the origin. In the 
case of a VAR model of order 2, a credible region from the poste- 
rior samples at lag 1 and lag 2 of a specific connectivity parameter 
can be obtained by constructing an empirical 3 -dimensional his- 
togram, and a 95% credible region is defined by the 0.025 contour 
level line of this histogram. 

To compare Granger causality across two experimental condi- 
tions A and B, at each link from region r to r r , we calculate the 
differences per lag, ^(V, r) - ^(r 7 , r), . . . , *^ lc (r', r) - 

®B V( f/ ' r ) an< ^ declare a difference in Granger causality to be 
significant when the credible regions associated to the lag- joint 
parameter distribution of the differences do not include the ori- 
gin. Similarly, we compare Granger causality across groups from 
the lag- joint parameter distribution of the differences between 
groups. 

2.3.2. Subject-specific effective connectivity networks 

To construct a subject-specific Granger causality network, we 
use the individual connectivity coefficients determined by the 
group connectivity coefficients plus the subject deviation: <\>^ = 
(|) + S (s) . We then proceed to build the network in a manner that 
is analogous to that at the group level. Our approach to test 
for differences in Granger causality at the subject level between 
experimental conditions is similar to the approach at the group 
level. 



2.3.3. Subject variability over connectivity coefficients 

We quantify the variance among connectivity coefficients via the 
parameters diagf^^ 1 , d^ 1 , . . . , d~ 1 }. Specifically, the parameter 



dj represents the variance among subjects for the corresponding 
connectivity coefficient. Where the correspondence is imposed by 
the order given in Equation (3). If the variance is very small rel- 
atively to the population value of ^cond.,k( r ^ r )> men we can sa Y 
that the variations across subjects for the associated link can be 
ignored. 
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2.3.4. Modeling multi-group effective connectivity 

If we have several groups in the analysis, we can state the pro- 
posed model to allow for group-specific connectivity parameters. 
To illustrate, suppose we have two groups and each group has 

its own connectivity parameters. Let <\> := 

Grp.l ^Grp .2] where 

^Grp.j are me connectivity coefficients for group j, 4>Grp.j : — 
vec [&a,i • • • ®a,k ®b,i • • • • To include each group's own 

connectivity coefficients, we re-define the matrices X, D* as 



X = 







X (2) A (2) 






, D* = 


yX^A^J 




fDGrp.A 


^D Grp . 2 ; 



0 A (2) D 



\ 



0 ... A^D 



and 



where A^ = [i^ 0] if subjects belongs to group 1, A^ = [0 1^] if 
subject s belongs to group 2, and DGrp.j is the diagonal matrix that 
quantifies the inter-subject variability among subjects from group 
j. Therefore, the conditional posterior distributions of the param- 
eters in DGrp.j are slightly modified by considering the subjects in 
group;'. 

3. EFFECTIVE CONNECTIVITY ANALYSIS IN A fMRI STROKE 
STUDY 

To demonstrate the utility of the proposed model, we analyzed 
functional magnetic resonance imaging (fMRI) time series data 
collected from healthy controls and stroke patients with residual 
motor deficit. Chronic stroke is associated with well- documented 
bilateral changes in fMRI activation strength and volume within 
the motor system. However, it is possible that there may be 
stroke -related differences in connectivity both with and without a 
stimulus that are not equally reflected in changes in fMRI activa- 
tion strength or volume (Rehme and Grefkes, 2013). Our goal in 
this paper is to develop a flexible statistical model for investigating 
these contrasts in effective connectivity. 

3.1. PARTICIPANTS 

Entry criteria include age > 18 years, ischemic stroke 1 1-26 weeks 
prior to first study assessments, and some residual motor deficit — 
Action Research Arm Test score <52 (range = 0-57, normal = 57) 
OR 9 hole-peg test score <25% of the unaffected hand score. In 
this analysis, there were two groups: 16 stroke patients with the 
right side of their body affected and 13 healthy participants that 
served as control group. 

3.2. TASKS, APPARATUS AND PROCEDURES 

A Philips Achieva 3.0T MRI whole-body scanner was used 
to collect patients' imaging data. High- resolution Tl -weighted 
images were acquired using a 3D MPRAGE sequence (rep- 
etition time (TR) = 8.5 ms, echo time (TE) = 3.9 ms, flip 
angle = 8, field of view (FOV) = 256 x 204 x 150, slices = 
150, voxel size = 1 x 1 x 1mm 3 ). Blood oxygenation level- 
dependent (BOLD) images were acquired using a T2* -weighted 
gradient-echo echo planar imaging sequence (TR = 2000ms, 
TE = 30 ms, flip angle = 70, FOV = 240 x 240 x 154, slices = 



31, voxel size = 2 x 2 x 2 mm 3 ). The MRI protocols were the 
same for each patient. Functional magnetic resonance imaging 
(fMRI) contrasted affected hand grasp -release movements (active 
condition) with rest condition. 

For the participants in the healthy control group, the exper- 
iment was divided into 2 sessions each with 48 fMRI scans, 
alternating 12 scans between the two conditions, always starting 
with rest condition. Therefore the total time series points that 
were considered in the analysis for each participant in this group 
is T = 2 x 48 = 96. For participants within the right side affected 
group, the experiment was divided into 3 sessions each with 48 
fMRI scans, alternating 12 scans between the two conditions but 
always starting with rest condition. Therefore the total time series 
points that were considered in the analysis for each participant in 
this group is T = 3 x 48 = 144. 

3.3. SELECTION OF REGIONS OF INTEREST 

Using the Marsbar toolbox (Brett et al., 2002), bilateral ROIs were 
created within primary motor cortex (Ml) and dorsal premotor 
cortex (PMd), and a midline supplementary motor area (SMA) 
ROI. The mean coordinates of the ROIs were taken from a meta- 
analysis by Mayka et al. (2006). Each of these regions was defined 
by a 6.0 mm radius sphere around the specific local coordinates. 
We summarized the information across the 110 voxels in each 
ROI by the mean of fMRI time series. We consider the mean 
time course within the ROI is a good representative of the BOLD 
activity in each specified ROI because the images were smoothed 
during the preprocessing step (as described below). The ROIs 
constructed for the analysis are small — spheres with a radius of 
6 mm — relative to the Gaussian smoothing kernel having FWHM 
of 8 mm used to smooth the images. Thus, the neighboring voxels 
within each ROI in the smoothed image are similar to each other. 
The ROIs considered in our analysis are localized as in (Figure 2). 
To illustrate the proposed model, we focus only on these 5 prese- 
lected regions which are known to be highly involved in the task 
experiment after stroke. 

3.4. PREPROCESSING STEPS 

Functional data from all the sessions were preprocessed using 
SPM8 software (Wellcome Trust Center for Neuroimaging, www. 
fil.ion.ucl.ac.uk/spm). Preprocessing steps included realignment 
to the first image, coregistration to the mean EPI image, normal- 
ization to the standard MNI EPI template, and spatial smooth- 
ing (FWHM = 8 mm). Data was visually inspected for head 
movement. Data was rejected for patients with >2mm head 
displacement. 

Previous to the VAR model fitting, we removed expected 
trends on averaged fMRI time series from each ROI and each 
participant. The expected trends were obtained per subject and 
per ROI by fitting the linear model with the following regres- 
sors: (1) the expected BOLD condition-specific response X c (t), 
defined as the convolution between the canonical hemodynamic 
response function (difference of two gamma functions) and the 
indicator function for condition c (in our case, rest and active); 
(2) drift components which were defined as polynomials up to 
the order three; and (3) seasonal components (sines and cosines 
at frequencies below 0.25 Hz and above 0.5 Hz) to remove car- 
diac and respiratory effects. To analyze connectivity, the proposed 
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Granger Causality for Healthy Controls 

► REST 
► ACTIVE 



Granger Causality for Stroke Patients 




DIFFERENCES BETWEEN 
HEALTHY CONTROLS AND 
STROKE PATIENTS DURING 
ACTIVE 




FIGURE 2 | Links represent Granger causality connections determined 
by the 95% credible regions defined by the 0.025 contour level of 
the lag-joint sample distribution of connectivity coefficients on each 



group. Brown edge is significantly different between healthy control 
group and stroke group in the Granger causality networks during active 
condition. 



hierarchical VAR model was fit to the ROI- specific and subject- 
specific residuals which were obtained by removing the estimated 
from the observed average fMRI time series. 

3.5. TIME SERIES FOR THE PROPOSED HIERARCHICAL CONNECTIVITY 
MODEL 

Our data consists of subject- specific multivariate time series 
(with 5 components representing brain activity in the selected 
5 ROIs) from two groups: 13 participants in the healthy con- 
trol group and 16 patients in the stroke group. The length of 
time series in the control group is 96, while in the stroke group 
is 144. Note that fitting a VAR model of order (maximal lag) 
K having connectivity parameters representing each condition 
(rest/active) requires a total of 2 x K x 25 parameters per group 
(control/stroke), plus the parameters that describe the individ- 
ual deviations from the group. Therefore in this case, there is a 
need to control for the large number of parameters to be consid- 
ered in the model. We fit the proposed hierarchical VAR model of 
order K = 2 while accounting for the large number of parameters 
via elastic net. The first order VAR model, is often recommend- 
able for fMRI time series, given the low temporal resolution on 
this data. There are a number of papers that suggest a VAR order 
1 can sufficiently capture the covariance structure in a multi- 
ROI fMRI data (Martinez-Monies et al, 2004; Valdes-Sosa et al, 
2005). Here, we first fit the VAR(l) model but the diagnostics 
(based on the auto -correlation of the residuals) suggest that it 
does not sufficiently capture the temporal dynamics. As a next 
step, we fit a VAR(2) model which turned out to be adequate 
for the stroke data at hand. The model was implemented by 
a Gibbs sampling procedure; there were generated 80000 iter- 
ations. The first 60000 iterations were discarded. For inference 
purposes, the last 20,000 posterior samples were thinned by sam- 
pling every 5-th sample to get the posterior distribution sample. 



We investigated the convergence of the samples by inspect- 
ing trace plots and via the Geweke technique as described in 
Brooks and Roberts (1998), we do not find evidence of lack of 
convergence. 

3.6. RESULTS OF EFFECTIVE CONNECTIVITY ANALYSIS 

3.6. 1. Group-specific granger causality networks 

Effective connectivity group-networks were constructed in terms 
of the Granger causality concept. To identify Granger causality 
links, we consider the 95% level credible regions defined by the 
lag- joint distribution of VAR- coefficients of each group and we 
select the links associated to the regions that did not include the 
origin. Credible regions were delimited by the 0.025 contour line 
of the empirical histogram that approximates the lag- joint distri- 
bution. (Figure 2) shows the Granger causality networks by group 
and condition. In both groups, more connectivity links were 
observed in the active phase than during rest. During movement 
there is up -regulation of areas within the motor system, compared 
to rest, due to the internally driven processes required to move 
as well as the afferent feedback resulting from execution of the 
movement. The degree of unilateral connectivity is an observable 
difference between healthy control and stroke groups. As would 
be expected in non-injured brains, the control group exhibited 
connectivity patterns during movement that were largely con- 
fined to the hemisphere contralateral to movement. However, in 
the stroke group, additional connections were also observed in 
the contralesional hemisphere, specifically contralesional PMd. 
This is in-line with previous studies that report widespread acti- 
vation in the contralesional hemisphere, including contralesional 
PMd, during movement of the paretic hand (Rehme et al., 2012). 
Indeed, contralesional PMd has been to shown to play a compen- 
satory role in paretic arm movement (Johansen-Berg et al., 2002; 
Bestmann et al, 2010). 
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3.6.2. Network differences between control and stroke groups 

We determined the significant differences in Granger causality 
between the healthy control group and stroke patients; these 
differences are indicated with brown links in (Figure 2). In par- 
ticular we found a difference during the active condition in the 
influence of activity in the SMA on activity in the left PMd. In 
healthy controls, SMA activity predicted the subsequent activity 
in left PMd during movement of the right hand. In contrast, this 
connectivity link was not observed in the stroke patients. This 
result supports prior studies that indicate a reduction in ipsile- 
sional intrahemispheric connectivity after stroke (Rehme and 
Grefkes, 2013). 

For the stroke group, our findings of no significant connec- 
tivity between SMA and ipsilesional PMd may be related to 
a reduction of top-down control of motor movement in the 
impaired limb by the damaged hemisphere in patients with per- 
sisting post-stroke motor impairments. Dorsal premotor cortex 
is known to be important for higher order functions involved 
in motor planning (Rushworth et al., 2003; O'Shea et al., 2007; 
Wu and Hatsopoulos, 2007; Nakayama et al., 2008; Chouinard 
and Paus, 2010). Meanwhile, SMA has been implicated in inter- 
nally generated tasks (Tanji, 1994; Chen et al, 2009), bimanual 
coordination (Serrien et al., 2002; Steyvers et al., 2003), and 
programming of movements prior to execution (Roland et al., 
1980). As in the present study, Sharma et al. (2009) also found 
a significant reduction in connectivity between SMA and ipsile- 
sional premotor cortex in patients compared to controls during 
both paretic hand movement and motor imagery (Sharma et al., 
2009). However, a prior study found that there were no signifi- 
cant connectivity differences during movement between healthy 
controls and patients 3-6 months post stroke (Rehme et al., 
2011). Although not statistically significant, they did see a con- 
nection between contralateral SMA and PMd that was not present 
in patients at 3-6 months post-stroke. This non- significant dif- 
ference was also observed in a study of patients with subcor- 
tical stroke (Grefkes et al., 2008). The mixed findings may be 
attributable to the different effective connectivity models used in 
the analyses. Additional studies are needed to resolve the incon- 
sistent findings within the limited post- stroke fMRI effective 
connectivity literature. 

Comparing the region- specific model variances between the 
stroke patients and the healthy controls, we identified an increase 
in the variability for the ROIs in the patient control group relative 
to the healthy group, the region specific variances are given in 
(Table 1) 

3.6.3. Between-subject variation in effective connectivity 

The effective connectivity model also identified specific connec- 
tions that accounted for the greatest variability within groups. 
(Figure 3) presents the edges over which the top three largest 
group variances associated to the coefficients occurred. Large 
variability was also present in the motor network during the active 
phase. An overall picture of between subject standard deviations 
for all the considered links is presented in (Figure 4). 

In addition, more significant deviations were found in the 
patient group. This behavior could be explained by both the pres- 
ence of infarcts (an irregularity that is in the stroke but not patient 



Table 1 | Summary of posterior region-specific variances for the error 
term in the hierarchical model. 



Group 


ROI 


Q.025 


Q.50 


Q.975 


Healthy controls 


LM1 


5.5124 


5.988 


6.5183 




LPMd 


5.3635 


5.8234 


6.3431 




RM1 


4.5715 


4.9502 


5.3803 




RPMd 


6.1512 


6.7013 


7.2669 




SMA 


6.2479 


6.7828 


7.4008 


Stroke patients 


LM1 


12.985 


13.795 


14.676 




LPMd 


13.369 


14.173 


15.064 




RM1 


11.238 


11.929 


12.69 




RPMd 


9.2354 


9.8209 


10.445 




SMA 


9.7889 


10.383 


11.055 



Defined as the unique diagonal elements in the matrix V x . 



groups) and the randomness in the sites of these infarcts. A gen- 
eral picture of the links at which these subject specific deviations 
were identified is presented in (Figure 5). 

Subject- specific Granger causality networks can be 
constructed in a similar way to the group network by ana- 
lyzing the individual prediction parameters established in the 
model. To illustrate this point, we present two Granger causal- 
ity networks for two patients in the stroke group (Figure 6). 
Connectivity maps from these two patients did show reciprocal 
links between ipsilesional Ml and contralesional motor areas 
that were not seen at the group level. This is a connection 
that pervades the post-stroke connectivity literature. Stroke 
has been associated with increased inhibition from contrale- 
sional Ml onto ipsilesional Ml, as well as reduced inhibition 
of contralesional Ml by ipsilesional Ml (Murase et al., 2004; 
Rehme and Grefkes, 2013). A prior effective connectivity study 



Top 3 strongest variances (Stroke Patients) 

—►rest 




kCTIVE 



Top 3 strongest variances (Healthy Controls) 




FIGURE 3 | Links indicate the top 3 largest inter-subjects variances 
over connectivity coefficients for each group. Labels over links show the 
posterior mean value for the variance associated to the link. All variances 
represent the effect from the previous value (with the exception of the 
labeled as lag 2) in the indicated region to current value at the target region. 
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FIGURE 4 | Inter-subject standard deviations over connectivity coefficients for each group and condition. 



Links with subject specific deviations from the Stroke patients group mean 





FIGURE 5 | Links indicate that at least one significant subject-specific deviation was found on that connection. The thicker the link, the more number of 
significant deviations were identified on it. 



Granger Causality Subject 1 




did find that, compared to controls, stroke patients exhibited 
a negative influence of contralesional Ml to ipsilesional Ml 
during right hand movement (Grefkes et al., 2008). Considering 
the heterogeneity of the patient group, a larger study and a 
model that takes into account subject- specific anatomical infor- 
mation would improve statistical power and could resolve this 
finding. 



Granger Causality Subject 2 




FIGURE 6 | Granger causality subject-networks for two subjects in the 
stroke patients group. Links were determined by 95% credible regions of 
subject specific posterior means at lag 1 and 2. 



4. CONCLUSION 

We proposed the hierarchical VAR model to: (i) determine 
group and condition specific effective connectivity networks; 
(ii) compare connectivity between patient groups and between 
conditions; and (iii) investigate subject specific deviations 
from the network group. To control for the large parameter 
space and high correlation between the covariates of the VAR 
model, we used prior distributions so that the estimates are 
equivalent to that obtained from the elastic net penalization 
approach. 
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Because our prior distributions are equivalent to the elas- 
tic net penalization, our proposed model allows one to explore 
richer dependence structures (for example, fitting a VAR model 
of higher order) than the typical approach of using frequentist 
methods for fitting VAR models. This is a tremendous advantage 
especially when we do not have a large sample size such as the 
case of short fMRI time series. Even though our proposed mod- 
eling approach will regularize the parameter estimates in case the 
dimensionality of the parameter space is large, we still advise cau- 
tion with the use of an unreasonably large number of ROIs and 
model order, recalling that the number of parameters for each 
subject grows quadratically with the number of ROIs and linearly 
with the model order. As a practical implementation, we suggest 
fitting a VAR(l) first and then proceed with model diagnostics on 
the residuals. If the residuals appear like white noise (via auto- 
correlation plots) then a higher order is not necessary; otherwise, 
proceed to fitting a VAR with higher order and check the residuals. 

Another important issue is the choice of a "representative" 
time series within a ROI. Here, we used the mean time course 
within the ROI which is the most common approach. However, 
there are other potential approaches that we bring up for dis- 
cussion. First, one could build another level in the hierarchical 
structure so that the resulting hierarchy becomes: time series per 
voxel; voxels within ROIs; ROIs within a subject; subjects within 
a patient group. One could them introduce a local within-ROI 
spatial covariance structure using the ideas in Bowman (2005) 
and Kang et al. (2012). This procedure could of course introduce 
another layer of computational complexity. We did not pursue 
this approach here because the volumes of each ROI in our anal- 
ysis was small and the fMRI time series at voxels within each ROI 
are highly correlated, and as a result, any additional information 
gained will be small in comparison to the computational expense. 
A second alternative is to perform some dimension reduction step 
say, via principal components analysis, which will result in a few 
number of time series within an ROI that accounts for, say 80%, 
of the total variability within the ROI. A third alternative is to 
use summaries that are robust to outliers. Some examples include 
the median or trimmed means. A rigorous study that compares 
the current approach against these three alternatives is beyond the 
scope of this paper. 

The number of studies using effective connectivity to study 
post- stroke motor networks are relatively limited compared to 
traditional fMRI studies. Some key findings thus far include, 
increased connectivity from the prefrontal cortex within the 
ipsilesional hemisphere, reduced inhibition of the ipsilesional 
hemisphere onto contralesional Ml, and various excitatory 
increases/decreases within the ipsilesional hemisphere. The pro- 
posed VAR model elucidated the following: (1) specific differ- 
ences in patterns of effective connectivity between the chronic 
stroke and healthy control groups; and (2) quantification of 
between- subject variability in effective connectivity. These results 
could not have been derived from either DCM, SEM, or standard 
VAR models. The proposed model could be extended to include 
covariates (such as the volume of the infarct, amount of overlap 
between the stroke region and the cortico- spinal tract) and thus 
could potentially help to identify biomarkers of motor function 
improvements with post-stroke therapies. 
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